Nebius

Solutions Architect · Onboarding Program

Portfolio by Rus Teston

SA Enablement · Structured Onboarding

90-Day SA
Onboarding Blueprint

A week-by-week program designed to take a new Nebius Solutions Architect from Day 1 orientation through certified customer-ready status — with technical depth, measurable milestones, and a built-in manager coaching guide at every phase.

12 Weeks total

3 Phases

3 Cert gates

~52h Seat time

Program Timeline

Phase 01 · Foundation · Weeks 1–2 · ~12 hrs

Platform orientation, product portfolio, people and systems access. By end of Phase 1, the SA can explain Nebius AI Cloud to a colleague with confidence.

Phase 02 · Practitioner · Weeks 3–6 · ~24 hrs

Deep technical build: GPU clusters, orchestration, storage, MLOps, competitive positioning, and TCO modeling. By end of Phase 2, the SA can architect and defend a Nebius solution independently.

Phase 03 · Expert · Weeks 7–12 · ~16 hrs

Live deal involvement, customer-facing demos, Token Factory inference architecture, capstone design, and final certification. By end of Phase 3, the SA is independently customer-ready.

Wk 1 · Orientation & access

Wk 2 ✦ · Platform overview

Wk 3 · GPU deep dive

Wk 4 · Orchestration & storage

Wk 5 · MLOps & competitive

Wk 6 ✦ · TCO & gate 2

Wk 7 · Token Factory

Wk 8 · First customer call

Wk 9 · Demo mastery

Wk 10 · Solution design

Wk 11 · Capstone build

Wk 12 ✦ · Certification

Week-by-Week Program

Week 01 · Orientation, Access & the Nebius Story · Foundation

Key Activities

Complete HR onboarding, systems access, and equipment setup — Nebius console, GitHub, Slack, email, Okta SSO

Day 1 meeting with SA Manager — review 90-day program structure, expectations, and success criteria

Complete Nebius Company Story module — history, mission, Nasdaq listing, $700M raise, NVIDIA Reference Platform Partner status

First read-through of AI Cloud and Token Factory product pages at nebius.com

Meet your assigned buddy SA — schedule two job-shadow sessions for Weeks 2–3

People to Meet

SA Manager — 90-day program review and expectations alignment

Buddy SA — assigned peer for technical mentorship throughout the program

HR Onboarding — systems, benefits, and compliance training

IT / DevOps — Nebius console access, sandbox environment provisioning

Deliverables Due This Week

Nebius console access confirmed

GitHub + Slack onboarding complete

90-day program schedule agreed with manager

Buddy SA shadow sessions booked

👤

Manager Action — Week 1

Conduct a Day 1 welcome meeting covering the 90-day program structure, success criteria, and how you'll assess readiness at each gate. Confirm all systems access is provisioned before end of Day 1. Assign buddy SA and brief them on expectations for mentorship.

By end of Week 1, the SA can...

Access all Nebius systems, explain the Nebius company narrative and funding story, and describe the 90-day program structure to a peer.

Week 02 · AI Cloud Architecture & Product Portfolio · Foundation · ✦ Gate 1

Key Activities

Complete AI Cloud Architecture Overview module — full-stack platform walk-through including data center topology (US, Finland, France, Iceland)

Read and study the SemiAnalysis TCO study — understand how Nebius achieved Gold Medal ClusterMAX rating across all three modeled workloads

Product portfolio deep-read: AI Cloud vs Token Factory — when to recommend each, how they complement

Review Trust Center documentation — SOC 2, HIPAA, GDPR, ISO 27001 compliance architecture

Gate 1 Assessment: 30-minute architecture walkthrough with SA Manager — explain the Nebius platform, product portfolio, and data center footprint from memory

People to Meet

SA Lead / Principal SA — architecture overview session and Q&A

Product Marketing — understand how Nebius's positioning is crafted and what materials are available

Buddy SA — first job-shadow session on a real customer call

Deliverables Due This Week

✦ Gate 1: Architecture walkthrough passed

SemiAnalysis TCO study notes submitted

Product portfolio one-pager (personal notes)

First buddy SA job-shadow completed

👤

Manager Action — Week 2 (Gate 1)

Conduct the 30-minute Gate 1 architecture walkthrough. Evaluate: can the SA explain the Nebius platform unprompted? Do they understand the product portfolio distinction (AI Cloud vs Token Factory)? Can they articulate the data center footprint? Pass/fail with written feedback. If passing, advance to Phase 2 on schedule. If not, add a targeted remediation session before Week 3 begins.

By end of Week 2, the SA can...

Deliver a coherent, unprompted walk-through of Nebius AI Cloud architecture, the product portfolio, and the compliance posture — without referring to notes. Gate 1 passed.

Week 03 · NVIDIA GPU Portfolio & Hands-On Provisioning · Practitioner

Key Activities

GPU tier deep-dive: H100, H200, HGX B200, HGX B300, GB200 NVL72, GB300 NVL72 — architecture differences, memory specs, and use case fit

Hopper vs Blackwell architecture comparison — when to recommend which generation and why

Hands-on Lab 1: Provision an HGX H100 instance via Nebius console, then via CLI, then via Terraform — document the differences in provisioning time and configuration options

Run a basic CUDA workload and benchmark GPU utilization to build a personal reference for what MFU (Model FLOPs Utilization) looks like in practice

Second buddy SA job-shadow — focus on how the buddy handles GPU-related customer questions in real calls

People to Meet

Infrastructure Engineering Lead — GPU architecture context and cluster design principles

Buddy SA — second job-shadow session; debrief on GPU positioning in customer conversations

Deliverables Due This Week

Lab 1 completed: H100 provisioned via console, CLI, and Terraform

GPU tier comparison cheat sheet (personal reference)

MFU benchmark results documented

👤

Manager Action — Week 3

Review the SA's GPU tier comparison cheat sheet — are the use case descriptions accurate and customer-friendly? Is the SA distinguishing Hopper from Blackwell correctly? Check in at end of week: is the SA comfortable in the Nebius console independently? Note any gaps for focused remediation in Week 4.

By end of Week 3, the SA can...

Match any customer workload description to the correct Nebius GPU tier without hesitation, provision a GPU instance via three different methods, and explain MFU in customer-friendly language.
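To make the MFU explanation concrete, the back-of-envelope calculation can be sketched as below. This is a minimal sketch, assuming the common approximation of roughly 6 FLOPs per parameter per training token for dense transformer training. The function name and every input value are illustrative, not measured results; peak throughput should be taken from the datasheet for the GPU and precision actually in use.

```python
def model_flops_utilization(params: float, tokens_per_second: float,
                            num_gpus: int, peak_flops_per_gpu: float) -> float:
    """Estimate MFU for dense transformer training.

    Assumes ~6 * params FLOPs per token (forward plus backward pass).
    """
    if num_gpus <= 0 or peak_flops_per_gpu <= 0:
        raise ValueError("num_gpus and peak_flops_per_gpu must be positive")
    achieved_flops = 6.0 * params * tokens_per_second   # useful model FLOPs per second
    available_flops = num_gpus * peak_flops_per_gpu     # theoretical peak of the cluster
    return achieved_flops / available_flops


# Illustrative inputs only: a 7B-parameter model on 8 GPUs.
example_mfu = model_flops_utilization(
    params=7e9,
    tokens_per_second=100_000,
    num_gpus=8,
    peak_flops_per_gpu=1e15,  # placeholder peak, not a datasheet value
)
print(f"MFU: {example_mfu:.1%}")
```

Note that MFU counts only the FLOPs the model needs, so it is usually lower than the GPU utilization percentage reported by monitoring tools.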

Week 04 · Orchestration: Managed Kubernetes & Slurm (Soperator) · Practitioner

Key Activities

Managed Kubernetes deep-dive — topology-aware scheduling, node health monitoring, auto-repair for fault-tolerant training

Soperator (Slurm) architecture — when customers choose Slurm over Kubernetes and why. HPC workload patterns vs ML training patterns

Hands-on Lab 2: Deploy a multi-node Kubernetes cluster on Nebius — configure a training job with topology-aware scheduling, simulate a node failure, and observe auto-repair behavior

InfiniBand networking module — NVIDIA Quantum-X800 InfiniBand fabric, non-blocking architecture, and what it enables for distributed training at scale

High-performance storage architecture — up to 1 TB/s read throughput for shared filesystems, 2 GB/s per GPU for object storage. WEKA and VAST Data integrations.
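The storage throughput figures quoted above become easier to discuss with customers once they are turned into transfer times. The sketch below is a rough best-case estimate only: it assumes decimal units (1 TB = 1000 GB), ignores contention and protocol overhead, and uses hypothetical checkpoint and dataset sizes.

```python
def transfer_seconds(size_gb: float, throughput_gb_per_s: float) -> float:
    """Best-case time to move size_gb at a sustained throughput."""
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb / throughput_gb_per_s


def aggregate_object_throughput(num_gpus: int, per_gpu_gb_per_s: float = 2.0) -> float:
    """Aggregate object storage rate, scaling the per-GPU figure linearly."""
    return num_gpus * per_gpu_gb_per_s


SHARED_FS_PEAK_GB_PER_S = 1000.0  # "up to 1 TB/s" read, decimal units assumed

checkpoint_gb = 500.0    # hypothetical checkpoint size
dataset_gb = 12_800.0    # hypothetical dataset size

checkpoint_read_s = transfer_seconds(checkpoint_gb, SHARED_FS_PEAK_GB_PER_S)
object_rate = aggregate_object_throughput(num_gpus=64)
dataset_read_s = transfer_seconds(dataset_gb, object_rate)

print(f"Checkpoint read from shared filesystem: {checkpoint_read_s:.1f} s (best case)")
print(f"Object storage aggregate for 64 GPUs: {object_rate:.0f} GB/s")
print(f"Dataset read from object storage: {dataset_read_s:.0f} s (best case)")
```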

People to Meet

Platform Engineering — Kubernetes team — cluster architecture session and lab guidance

Storage engineering lead — storage architecture walk-through and customer scenario discussion

Deliverables Due This Week

Lab 2: Multi-node K8s cluster deployed and job scheduled

Node failure recovery documented and timed

InfiniBand vs standard networking one-liner (personal notes)

👤

Manager Action — Week 4

Ask the SA to walk you through the K8s lab results as if presenting to a customer CTO. Can they explain fault-tolerant training in plain language? Do they understand when to recommend Kubernetes vs Slurm? This is the most technically demanding week — flag any major gaps immediately to allow mid-program adjustment.

By end of Week 4, the SA can...

Architect a multi-node training cluster on Nebius, explain fault tolerance to a customer engineering team, and articulate the InfiniBand and storage architecture story without referring to documentation.
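For the "documented and timed" part of Lab 2, one simple way to time a recovery is to record node readiness at intervals and compute the gap between the first failed observation and the next healthy one. The sketch below assumes a hypothetical log of (timestamp in seconds, ready flag) samples; how the samples are collected is left out.

```python
from typing import Iterable, Optional, Tuple


def recovery_seconds(samples: Iterable[Tuple[float, bool]]) -> Optional[float]:
    """Return seconds from the first not-ready sample to the next ready sample.

    Returns None if the node never failed or never recovered in the log.
    Samples must be ordered by timestamp.
    """
    failed_at: Optional[float] = None
    for timestamp, ready in samples:
        if failed_at is None:
            if not ready:
                failed_at = timestamp  # first observation of the failure
        elif ready:
            return timestamp - failed_at  # first healthy observation afterwards
    return None


# Hypothetical observations taken during the simulated node failure.
observations = [(0.0, True), (10.0, True), (20.0, False), (30.0, False), (95.0, True)]
print(recovery_seconds(observations))
```

The result is only as precise as the sampling interval, so a shorter polling period gives a tighter estimate.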

Week 05 · MLOps Stack, Managed Services & Competitive Positioning · Practitioner

Key Activities

Managed MLflow — deploy, configure, and run an experiment tracking session. Understand model registry, artifact storage, and experiment comparison in the Nebius environment

Apache Spark on Nebius Managed Services — data pipeline architecture for ML preprocessing at scale

PostgreSQL managed service — metadata persistence patterns for production ML systems on Nebius

Competitive positioning deep-dive — Nebius vs AWS SageMaker, GCP Vertex AI, Azure ML, and CoreWeave. Differentiation framework: bare-metal MFU, availability speed, AI-native support, TCO.

Study real customer scenarios: Brave Search (inference), Recraft (training), Wubble (fine-tuning) — understand what drove each customer to Nebius and what they built
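For the MLflow experiment tracking activity above, a first session can be as small as the sketch below. It uses the standard open-source MLflow tracking calls; the experiment name, run name, parameters, and loss values are hypothetical, and the tracking server address is read from the MLFLOW_TRACKING_URI environment variable because the managed endpoint is not given here.

```python
import os


def tracking_uri(default: str = "http://localhost:5000") -> str:
    """Read the tracking server URI from the environment; the default is a placeholder."""
    return os.environ.get("MLFLOW_TRACKING_URI", default)


def log_demo_run(params: dict, losses: list) -> None:
    """Log one run with its parameters and a per-step loss curve."""
    import mlflow  # imported lazily so tracking_uri() works without MLflow installed

    mlflow.set_tracking_uri(tracking_uri())
    mlflow.set_experiment("sa-onboarding-week5")  # hypothetical experiment name
    with mlflow.start_run(run_name="baseline"):   # hypothetical run name
        mlflow.log_params(params)
        for step, loss in enumerate(losses):
            mlflow.log_metric("loss", loss, step=step)


# Example call (needs MLflow installed and a reachable tracking server):
# log_demo_run({"learning_rate": 3e-4, "batch_size": 32}, [2.1, 1.7, 1.4])
```

Logging a second run with different parameters is enough to try the experiment comparison view.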

People to Meet

Product Marketing lead — competitive intelligence briefing and messaging alignment

Customer Success — Brave Search and Recraft account context; what drove the decisions

Sales lead — understand how AEs position Nebius and where SAs typically enter the conversation

Deliverables Due This Week

MLflow experiment deployed and documented

Competitive displacement one-pager (personal notes)

Customer scenario summary: Brave, Recraft, Wubble

👤

Manager Action — Week 5

Run a mock competitive objection session — present the SA with "we're already on AWS" and evaluate the response. Is the displacement conversation natural and credible? Does the SA use the SemiAnalysis TCO framework as a proof anchor? This is a preview of what Gate 2 will require next week.

By end of Week 5, the SA can...

Deploy and configure the full Nebius MLOps stack, handle a competitive displacement conversation against AWS or GCP using the SemiAnalysis TCO framework, and name three Nebius customers with their workload type and business result.

Week 06 · TCO Modeling, Solution Design & Practitioner Certification · Practitioner · ✦ Gate 2

Key Activities

TCO modeling workshop — build a side-by-side cost comparison vs AWS for a realistic LLM pre-training workload using the SemiAnalysis framework. Document assumptions, GPU hours, storage, and networking costs.

Solution design practice — take a sample customer brief (mid-size AI startup, training a 7B parameter model) and produce a written solution design covering GPU selection, cluster configuration, storage architecture, and estimated TCO

Gate 2 Assessment: Live architecture demo to a mock customer panel (SA Manager + one senior SA). Present the solution design for the 7B model training scenario. 45 minutes including Q&A.

Review Gate 2 feedback and document personal development areas for Phase 3

People to Meet

SA Manager + Senior SA — Gate 2 mock customer panel assessors

Finance / RevOps — understand how Nebius pricing is structured and how SA-led TCO models feed into deal commercial structures

Deliverables Due This Week

✦ Gate 2: Architecture demo to mock panel passed

Written TCO model: Nebius vs AWS for 7B param training

Solution design document (7B model scenario)

Phase 3 development focus areas documented

👤

Manager Action — Week 6 (Gate 2)

Conduct the Gate 2 live architecture demo with a senior SA as co-assessor. Use the standardized rubric: (1) Technical accuracy of GPU and cluster design, (2) Clarity of TCO model assumptions, (3) Ability to handle technical Q&A without breaking down. Written feedback within 24 hours. Gate 2 pass unlocks Phase 3 and first supervised customer call in Week 8.

By end of Week 6, the SA can...

Design a complete Nebius solution for a realistic customer scenario, present it to a technical panel, defend architectural choices under questioning, and build a credible TCO model. Gate 2 passed. Nebius Practitioner Badge earned.
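The arithmetic behind a TCO model of this kind can be sketched as follows. This is not the SemiAnalysis framework itself, only a simplified illustration, assuming the common estimate of about 6 FLOPs per parameter per training token. Every price, token count, and throughput value is a placeholder to be replaced with quoted pricing and documented assumptions.

```python
def training_gpu_hours(params: float, tokens: float,
                       peak_flops_per_gpu: float, mfu: float) -> float:
    """Estimate GPU hours for pre-training with ~6 * params * tokens total FLOPs."""
    if not 0.0 < mfu <= 1.0:
        raise ValueError("mfu must be in (0, 1]")
    total_flops = 6.0 * params * tokens
    effective_flops_per_gpu = peak_flops_per_gpu * mfu
    return total_flops / effective_flops_per_gpu / 3600.0


def total_cost(gpu_hours: float, gpu_hour_price: float, storage_tb: float,
               storage_tb_month_price: float, months: float,
               network_cost: float = 0.0) -> float:
    """Sum compute, storage, and networking costs for one provider."""
    compute = gpu_hours * gpu_hour_price
    storage = storage_tb * storage_tb_month_price * months
    return compute + storage + network_cost


# Placeholder scenario: 7B parameters, 1T tokens, 40% MFU.
gpu_hours = training_gpu_hours(params=7e9, tokens=1e12,
                               peak_flops_per_gpu=1e15, mfu=0.4)

# Placeholder prices for two unnamed providers; not real quotes.
provider_a = total_cost(gpu_hours, 2.0, storage_tb=100, storage_tb_month_price=20.0, months=2)
provider_b = total_cost(gpu_hours, 3.0, storage_tb=100, storage_tb_month_price=25.0,
                        months=2, network_cost=1500.0)

print(f"GPU hours: {gpu_hours:,.0f}")
print(f"Provider A: {provider_a:,.0f}  Provider B: {provider_b:,.0f}")
```

Because MFU sits in the denominator, a small change in the assumed MFU moves the GPU hours a lot, which is why the assumptions belong in the written model.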

Week 07 · Token Factory: Production Inference Architecture · Expert

Key Activities

Token Factory API deep-dive — model endpoint architecture, vLLM-optimized throughput, and how Nebius has optimized DeepSeek R1 inference for production use

Hands-on Lab 3: Architect a production inference system using Token Factory — configure authentication, invoke a model endpoint, set rate limits, implement monitoring, and measure time-to-first-token

Latency-optimized serving for reasoning models — when to recommend Token Factory (managed) vs self-managed vLLM on AI Cloud compute

Autoscaling inference endpoint design — handling variable production traffic patterns for customer AI applications

Study Brave Search inference architecture — 11M+ AI-generated answers daily at ~100% GPU utilization. How did they achieve it?
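For the autoscaling design activity above, a first-pass replica count can be estimated with Little's law (requests in flight = arrival rate × average latency). The sketch below is a sizing heuristic under stated assumptions, not a description of how Token Factory scales; the function name, headroom factor, and example traffic numbers are all hypothetical.

```python
import math


def required_replicas(requests_per_second: float, avg_latency_seconds: float,
                      concurrency_per_replica: int, headroom: float = 0.2) -> int:
    """Estimate replicas needed to serve a steady request rate.

    Little's law gives the average number of requests in flight; headroom
    adds spare capacity for bursts. Always returns at least one replica.
    """
    if concurrency_per_replica <= 0:
        raise ValueError("concurrency_per_replica must be positive")
    in_flight = requests_per_second * avg_latency_seconds
    needed = in_flight * (1.0 + headroom) / concurrency_per_replica
    return max(1, math.ceil(needed))


# Hypothetical traffic: 50 requests/s, 2 s average latency, 16 concurrent requests per replica.
peak_replicas = required_replicas(50, 2.0, 16)
idle_replicas = required_replicas(0, 2.0, 16)
print(peak_replicas, idle_replicas)
```

Running the estimate for both peak and off-peak traffic gives the minimum and maximum bounds for an autoscaling policy.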

People to Meet

Token Factory product team: product roadmap briefing and inference architecture deep-dive

Brave Search account team — case study debrief on how the inference architecture was designed

Deliverables Due This Week

Lab 3: Production inference endpoint live and monitored

Token Factory vs self-managed decision framework (personal notes)

Brave Search architecture summary (personal study notes)

👤

Manager Action — Week 7

Ask the SA to demo the Token Factory endpoint they built in Lab 3 as if presenting to a customer CTO. Evaluate: can they explain the Token Factory vs self-managed tradeoff clearly? Are they comfortable with the vLLM/DeepSeek R1 context? Confirm Week 8 customer call is scheduled with a real AE partner.

By end of Week 7, the SA can...

Architect a production inference system on Token Factory, explain the managed vs self-managed inference tradeoff to a customer, and describe how Brave Search achieves ~100% GPU utilization on Nebius.
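Time-to-first-token from Lab 3 can be measured around any streaming response by timing the arrival of the first non-empty chunk. The sketch below is client-agnostic: `fake_stream` is a stand-in generator, not the Token Factory API, and in the lab it would be replaced by whatever iterator the chosen client returns for a streamed completion.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def measure_stream(chunks: Iterable[str],
                   clock: Callable[[], float] = time.perf_counter
                   ) -> Tuple[Optional[float], float, str]:
    """Consume a stream and return (time_to_first_token, total_time, text)."""
    start = clock()
    first_token_at: Optional[float] = None
    parts = []
    for chunk in chunks:
        if first_token_at is None and chunk:
            first_token_at = clock() - start  # first non-empty chunk has arrived
        parts.append(chunk)
    total = clock() - start
    return first_token_at, total, "".join(parts)


def fake_stream(delay: float = 0.05) -> Iterator[str]:
    """Stand-in for a streamed model response (hypothetical)."""
    time.sleep(delay)  # simulates queueing and prefill before the first token
    yield "Hello"
    yield ", world"


ttft, total, text = measure_stream(fake_stream())
print(f"TTFT: {ttft * 1000:.0f} ms, total: {total * 1000:.0f} ms, text: {text!r}")
```

Repeating the measurement several times and reporting a median and a high percentile is more useful than a single run.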

Week 08 · First Live Customer Call: Supervised Observation · Expert

Key Activities

First live customer call: Attend a real AE-led customer discovery call as an observer. Do not take the technical lead — observe how the AE qualifies the workload and when/how they introduce the SA

Post-call debrief with the AE partner — what signals did the customer give? What solution would you design? What would you have done differently?

Second live customer call — take the technical lead on a follow-up call with AE present. Answer technical questions, qualify workload depth, and propose next steps

RAG and Agentic Search solutions study — Nebius RAG architecture, embedding model selection, vector storage options, retrieval optimization patterns

Fine-tuning architecture review — QLoRA, PEFT, LoRA patterns on Nebius. When to recommend fine-tuning vs RAG vs full training.
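The retrieval step at the heart of the RAG study item above can be illustrated with a few lines of plain Python. This is a toy sketch: the three-dimensional vectors and document IDs are made up, a real system would get embeddings from an embedding model and search a vector store, and cosine similarity is only one of several common scoring choices.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          documents: Sequence[Tuple[str, Sequence[float]]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Return the k documents most similar to the query, best first."""
    scored = [(doc_id, cosine(query, vector)) for doc_id, vector in documents]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Made-up embeddings for illustration only.
docs = [
    ("gpu-tiers", [1.0, 0.0, 0.0]),
    ("storage", [0.0, 1.0, 0.0]),
    ("inference", [0.7, 0.0, 0.7]),
]
results = top_k([1.0, 0.0, 0.2], docs, k=2)
print([doc_id for doc_id, _ in results])
```

The retrieved passages are then placed in the model prompt, which is the key contrast with fine-tuning, where the knowledge is baked into the weights.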

People to Meet

AE partner — assigned Account Executive for live customer call co-sell experience

Customer(s) — first live customer interaction under SA Manager supervision

SA Manager — end-of-week coaching debrief on first customer call performance

Deliverables Due This Week

First customer call completed (observer role)

Second customer call completed (technical lead role)

Post-call solution design note submitted to manager

RAG vs fine-tuning decision framework (personal notes)

👤

Manager Action — Week 8

Attend the second customer call where the SA takes the technical lead role. Evaluate: did they listen before proposing? Did they correctly identify the workload type? Did they position the right Nebius solution? Did they handle a technical question they didn't know the answer to gracefully? Structured coaching debrief within 48 hours of the call.

By end of Week 8, the SA can...

Lead the technical portion of a live customer discovery call, ask qualifying questions, correctly identify the customer's workload type, and propose a Nebius solution — without manager prompting.

Week 09 · Demo Mastery: Building & Delivering the Nebius Demo · Expert

Key Activities

Build a personal demo environment — a repeatable, customer-ready Nebius AI Cloud demo that can be tailored to training, fine-tuning, or inference scenarios in under 10 minutes of prep

Deliver three internal demo run-throughs — to SA Manager, to a senior SA, and to an AE partner. Incorporate feedback after each run.

Demo disaster recovery practice — practice handling a live failure (cluster not provisioning, latency spike, console error) without breaking the narrative for the customer

Study the Nebius Solution Library on GitHub — understand which Terraform recipes are most commonly used in customer POCs and which are relevant to your territory

Third live customer call — take full SA ownership of the technical conversation. No observer safety net.

People to Meet

AE partner — demo feedback session from a sales perspective: "what landed, what confused the customer"

Senior SA — advanced demo technique session and feedback on narrative structure

Customer — third live call; SA fully independent

Deliverables Due This Week

Personal demo environment built and documented

Three internal demo run-throughs completed

Demo feedback log maintained (one entry per run)

Third live customer call completed independently

👤

Manager Action — Week 9

Watch one of the SA's internal demo run-throughs. Score against three criteria: (1) Technical accuracy — is the demo showing what it claims to show? (2) Narrative clarity — does the customer know why this matters at every step? (3) Disaster recovery — what happened when something went wrong in the run? Share written scoring before the AE partner demo run-through.

By end of Week 9, the SA can...

Deliver a polished, customer-ready Nebius AI Cloud demo tailored to any workload scenario, recover gracefully from a live technical failure, and lead a customer call fully independently without manager observation.

Week 10 · Advanced Solution Design: Multi-Workload Customer Scenarios · Expert

Key Activities

Complex scenario design practice. Work through three advanced customer scenarios: (1) a pharmaceutical company needing HIPAA-compliant AI training in the EU, (2) a media company scaling Stable Diffusion inference to production, (3) a fintech building an agentic search system with RAG on Nebius

Physical AI and Robotics solution patterns — how Nebius's infrastructure supports simulation workloads and physical AI model training

Enterprise security and compliance deep-dive — tenant-level isolation, IAM architecture, and how to respond to an enterprise security questionnaire using the Trust Center

Prepare for the capstone: review the capstone scenario (provided by SA Manager at end of Week 10) and begin architecture planning

People to Meet

Security / Compliance lead — enterprise security questionnaire walk-through

Senior SA — complex scenario review and architecture feedback

SA Manager — capstone scenario briefing at end of week

Deliverables Due This Week

Three advanced scenario solution designs (written)

Enterprise security Q&A using Trust Center completed

Capstone scenario brief received and initial architecture sketch begun

👤

Manager Action — Week 10

Review the SA's three advanced scenario solution designs. Are the compliance recommendations accurate for the pharmaceutical EU scenario? Is the Stable Diffusion inference architecture optimal for the media scenario? Brief the SA on the capstone scenario at end of Week 10 so that architecture planning can start before the Week 11 build begins.

By end of Week 10, the SA can...

Design a complete Nebius solution for a regulated-industry customer with EU data residency, handle an enterprise security questionnaire independently, and respond to advanced agentic AI and RAG architecture questions.

Week 11 · Capstone Build: Full Solution Design for Realistic RFP · Expert

Key Activities

Respond to the capstone RFP scenario (provided Week 10) — a realistic enterprise AI workload covering training, fine-tuning, and production inference requirements across multiple use cases

Produce a complete solution design document including: executive summary, architecture diagram, GPU selection rationale with Hopper vs Blackwell recommendation, cluster configuration, storage design, MLOps stack recommendation, and full TCO model vs AWS

Prepare the 30-minute customer presentation — slides, live demo environment, and anticipated Q&A responses. Rehearse twice before Week 12 delivery

Internal rehearsal with AE partner — simulate the full presentation including objections and competitive challenge from a mock AWS team

People to Meet

AE partner — rehearsal panel for capstone presentation; provide competitive challenge

Buddy SA — peer review of solution design document before final submission

SA Manager — mid-week check-in to ensure capstone is on track

Deliverables Due This Week

Capstone solution design document completed

Architecture diagram finalized

TCO model: Nebius vs AWS completed

30-minute presentation deck ready for Week 12

Internal rehearsal with AE partner completed

👤

Manager Action — Week 11

Review the capstone solution design document before Week 12 begins — do not wait until the presentation. Flag any technical inaccuracies, missing components, or weak sections so the SA has time to correct before the panel. Confirm the Gate 3 panel (SA Manager + VP Sales or CRO + one Senior SA) is scheduled and briefed on their assessor roles.

By end of Week 11, the SA can...

Produce a complete, presentation-ready solution design document for a complex enterprise AI workload — covering architecture, GPU selection, storage, MLOps, TCO, and compliance — without manager assistance.

Week 12 · Expert Certification: Capstone Presentation & SA Graduation · Expert · ✦ Gate 3

Key Activities

Gate 3 Capstone Presentation: 30-minute solution presentation to the full panel (SA Manager, VP Sales or CRO, senior SA). Present the complete solution design, deliver a live Nebius demo, defend architectural choices under competitive challenge from the panel, and walk through the TCO model.

Panel debrief and structured feedback — written assessor scores across five dimensions: technical accuracy, solution completeness, demo quality, objection handling, and commercial awareness

Certified Nebius SA Expert ceremony — formal graduation, badge issuance, and first solo customer account assignment

90-day retrospective with SA Manager — what worked in the program, what to improve for the next cohort, and the SA's 30-60-90 growth plan for the next quarter

People to Meet

SA Manager — Gate 3 lead assessor and graduation ceremony

VP Sales or CRO — panel member; executive perspective on commercial relevance of the solution

Senior SA — panel member; technical depth assessor

First solo customer — account assigned post-certification; introductory call in Week 13

Deliverables Due This Week

✦ Gate 3: Capstone presentation to panel passed

✦ Certified Nebius SA - Expert Badge issued

Written panel feedback received and reviewed

90-day retrospective completed with manager

30-60-90 growth plan for Q2 agreed

First solo customer account assigned

👤

Manager Action — Week 12 (Gate 3 + Graduation)

Lead the Gate 3 capstone panel. Score across five dimensions using the standardized rubric. Deliver the certification badge and written assessment within 24 hours of the presentation. Complete the 90-day program retrospective — capture feedback on module quality, pacing, and lab relevance for use in the next SA onboarding cohort. The SA's 30-60-90 growth plan for Q2 should be agreed before end of week.

By end of Week 12, the SA is...

A Certified Nebius SA — Expert. Independently customer-ready. Assigned their first solo account. Capable of architecting, demonstrating, and defending any Nebius AI Cloud solution without manager support.

Program Milestones

Three certification gates close out the three phases. Each gate is passed by live demonstration, not by a written test.

Week 1

Program Kickoff

Systems access, buddy SA assigned, 90-day plan agreed

Week 2 · End of Phase 1

Gate 1 — Foundation Badge

30-min architecture walkthrough with SA Manager. Pass to enter Phase 2.

Nebius Foundation Badge

Weeks 3–5

Technical Deep Build

GPU, orchestration, MLOps, competitive — hands-on labs every week

Week 6 · End of Phase 2

Gate 2 — Practitioner Badge

45-min live architecture demo to mock customer panel. Pass to enter Phase 3 and first supervised customer call.

Nebius Practitioner Badge

Weeks 7–11

Customer-Facing Ramp

Token Factory, live customer calls, demo mastery, complex solution design, capstone build

Week 12 · End of Phase 3

Gate 3 — Expert Certification

30-min capstone presentation to a panel of the SA Manager, VP Sales or CRO, and a senior SA. First solo account assigned on pass.

Certified Nebius SA — Expert

Stakeholder Map

Everyone the SA needs to meet in their first 90 days, and why.

SA Manager

Program owner. Conducts all three gate assessments. Weekly check-ins. Final certification authority.

Wk 1

Buddy SA (Assigned Peer)

Technical mentor. Two job-shadow sessions in Weeks 2–3. Available for informal Q&A throughout the 90 days.

Wk 1

AE Partner

First live customer call partner in Week 8. Capstone rehearsal panel. Ongoing co-sell relationship post-graduation.

Wk 8

Product Marketing Lead

Competitive intelligence briefing. Messaging alignment. Customer proof point context (Brave, Recraft, Wubble).

Wk 5

Senior SA / Principal SA

Architecture review sessions. Gate 2 mock panel co-assessor. Gate 3 panel technical assessor. Demo feedback.

Wk 2

VP Sales or CRO

Gate 3 capstone panel — provides executive commercial perspective. First exposure to the SA as a credible technical partner.

Wk 12

Token Factory Product Team

Product roadmap briefing. Inference architecture deep-dive. vLLM and DeepSeek R1 optimization context.

Wk 7

90-Day Deliverables Checklist

Every formal deliverable due across the 12-week program.

✦ Gate 1: Architecture walkthrough · Due: End of Week 2

Lab 1: H100 provisioned via console, CLI, Terraform · Due: Week 3

Lab 2: Multi-node K8s cluster deployed · Due: Week 4

Competitive displacement one-pager · Due: Week 5

TCO model: Nebius vs AWS (7B param training) · Due: Week 6

✦ Gate 2: Live architecture demo to mock panel · Due: End of Week 6

Lab 3: Token Factory inference endpoint live · Due: Week 7

First customer call (observer role) · Due: Week 8

Personal demo environment built & documented · Due: Week 9

Three advanced scenario solution designs · Due: Week 10

Capstone solution design document · Due: Week 11

✦ Gate 3: Capstone presentation to full panel · Due: Week 12

90-day retrospective + Q2 growth plan · Due: End of Week 12

Technical Environment Setup

Everything the SA needs provisioned before Day 1 of Week 1.

Nebius AI Cloud Console

Full console access with a dedicated sandbox project. Billing limit set for lab usage. Admin IAM role for the 90-day program.

Token Factory API Access

Developer API key with rate limits appropriate for lab work. Access to production model endpoints including DeepSeek R1.

GitHub — Nebius Solution Library

Contributor access to github.com/nebius/nebius-solution-library. Clone the repo locally before Week 3 lab work begins.

Terraform CLI + Nebius Provider

Terraform installed locally. Nebius Terraform provider configured with authentication. Validated against the console sandbox project.

kubectl + Helm

kubectl configured for the Nebius Managed Kubernetes sandbox cluster. Helm 3 installed for workload deployment in Week 4 lab.

Nebius CLI

Nebius CLI installed and authenticated. Required for Week 3 Lab 1 (provisioning comparison: console vs CLI vs Terraform).

CRM Access (Salesforce)

SA role access to Salesforce. Required from Week 8 onward for logging customer call notes and tracking deal involvement.
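A quick preflight check can confirm that the command-line tools listed above are installed before the first lab. The sketch below only checks that each executable is on the PATH; it does not verify authentication or versions. The executable names are assumptions, in particular `nebius` for the Nebius CLI, and should be adjusted to match the local installation.

```python
import shutil
from typing import Dict, Iterable, List

# Executable names are assumptions; adjust them to how each tool is installed locally.
REQUIRED_TOOLS = ("git", "terraform", "kubectl", "helm", "nebius")


def check_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> Dict[str, bool]:
    """Map each tool name to whether an executable with that name is on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}


def missing_tools(status: Dict[str, bool]) -> List[str]:
    """Return the names of tools that were not found, in sorted order."""
    return sorted(name for name, found in status.items() if not found)


status = check_tools()
for name, found in status.items():
    print(f"{name:10s} {'ok' if found else 'MISSING'}")
print("Missing:", ", ".join(missing_tools(status)) or "none")
```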
